Abstract

SARS coronavirus (SCV) was recently responsible for a sudden and widespread infection that caused almost 800 deaths. The
limited amount of structural information on SCV proteins is partly responsible for the lack of specific drugs against the virus. Coronavirus
helicases are highly conserved and distinctive proteins that have been proposed as suitable targets for antiviral drugs such as bananins,
which have recently been shown to inhibit the SCV helicase in vitro. Here, the tertiary structure of the SCV helicase has been predicted,
which will provide a solid foundation for the rational design of other antiviral helicase inhibitors.

© 2006 Elsevier Inc. All rights reserved. 

Keywords: SARS coronavirus; Protein structure; Structure prediction; Molecular modeling; Helicase structure; Drug design 


Severe acute respiratory syndrome coronavirus (SCV)
was responsible for a recent severe worldwide
infection that caused almost 800 deaths within a few
months between 2002 and 2003 [1]. Genomic investigations
have yielded a wealth of information on SCV evolution in
terms of sequence mutations [2-4], helping to clarify the
mechanism of viral transmission between animals and humans.
Since SCV is still circulating, so far almost asymptomatically,
in southern China, our preparedness against new dangerous
coronavirus outbreaks should include the
development of specific antiviral drugs, which remain
unavailable at the moment. In this respect, structural
genomics can play an important role in gaining insight into
viral protein targets. Inhibition of these targets may interfere
with the metabolism of the infecting virus without
strong side effects on patients.

Helicases have been targeted for therapy in other viruses,
suggesting that the SCV helicase may be a suitable target
for new anti-SCV drugs [5-9]. The SCV helicase tertiary
structure, not yet experimentally solved, would be useful
for designing specific inhibitors of this SCV enzyme. Here,
we report a structure prediction procedure to obtain a
reliable molecular model of this viral protein.

Materials and methods 

The sequence of the SCV helicase (576 amino acid residues, NCBI
Accession No. NP_828870) [10] was aligned with those of representatives of
the other coronavirus groups using ClustalW v.1.8 [11]. The ATP binding
site motif was predicted with the MotifScan software available at the
ExPASy web site [12]. Secondary structure prediction and fold recognition
studies were performed using PsiPred v.2.1 [13] and Phyre v.0.2 [14],
respectively. With the fold recognition method, the helicase of SARS
coronavirus (SwissProt Accession ID P59641) [15] identified the PDB entry
1UAA [16] as the template. Model building of the major domain (494
amino acids) of the SARS helicase was carried out using the ClustalW v.1.8
[11] alignment between the target sequence and that of the template structure.
Models were subsequently optimized according to secondary structure
predictions. Substitution of amino acid residues and modeling of insertions
and deletions in the target structure were performed using DeepView v.3.7
[17]. The 3D models were optimized by a 900-step minimization
run with AMBER [18] and finally validated with PROCHECK v.3.5.4 [19]
and PROSAII v.4.0 [20].
A model of the metal binding domain (MBD) was obtained using
DYANA v.1.5 [21]. The zinc atoms were inserted using
GROMACS v.3.2 [22] with the force field ffgmx43a2, which
allows a molecular minimization of this domain with the bound metal
atoms. To investigate how the MBD and the helicase domain assemble, a
docking simulation was performed using the ESCHER software [23]. A first
coarse run with a rotation step of 10° was carried out and approximately
500 results were collected. Only those models that were consistent with the
available experimental data were subjected to a second run with a rotation
step of 2° within ±20° of the starting position in order to refine the most
probable structures. The major side-chain clashes were removed by a
900-cycle minimization in the AMBER force field [18].
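
The two-stage rotational sampling described above can be illustrated with a minimal Python sketch. It only enumerates the angles visited along a single rotation axis, which is a simplifying assumption: the actual ESCHER search explores rigid-body orientations and scores each pose. The function names are hypothetical and are not part of ESCHER.

```python
def coarse_angles(step=10):
    """Angles (degrees) covering a full turn at the coarse rotation step."""
    return list(range(0, 360, step))


def refine_angles(start, half_width=20, step=2):
    """Angles within +/- half_width of `start`, wrapped into [0, 360)."""
    return [(start + delta) % 360
            for delta in range(-half_width, half_width + 1, step)]


if __name__ == "__main__":
    coarse = coarse_angles()       # first run: 10 degree step
    fine = refine_angles(350)      # second run: 2 degree step around one pose
    print(len(coarse), "coarse angles;", len(fine), "refined angles")
```

Under these assumptions the coarse run visits 36 angles per axis, and each retained pose is refined over 21 angles, so the fine step is applied only around solutions that have already passed the consistency filter.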

Results and discussion 

A set of 35 different amino acid sequences of SCV helicase
proteins, collected from the NCBI protein database,
was aligned using ClustalW v.1.8 [11], and
pairwise identities of 99% were observed among all the
sequences considered. Thus, a representative consensus
sequence was obtained for the SCV helicase, suggesting
the presence of two separate domains, i.e., the helicase
domain (Hel) and a metal binding domain (MBD). Such
a combination of a putative MBD and a Hel domain in
the same protein, as in Equine Arterivirus nsp10, has been
found in a number of other viral and cellular proteins [24-26].
Accordingly, the nsp13 model building was performed
in two separate steps, one for each domain of the enzyme,
and a hybrid protein structure was assembled from the two
modeled domains.

A 3D model of the SCV helicase Hel domain, spanning
residues 80-568, was built on the basis of the crystal
structure of the Escherichia coli Rep helicase. Both helicases
belong to Superfamily 1 of the helicase classification,
which is based on seven conserved sequence motifs [27].
Therefore, the Protein Data Bank entry 1UAA [16],
corresponding to the E. coli Rep helicase crystal structure, was used
as a suitable template for model building of the Hel
domain using the threading program Phyre v.0.2 [14]. A
manual alignment between the secondary structure
elements predicted by shuffled PsiPred runs [13] and the ones
observed in the model enhanced the correspondence of
identical and positively conserved residues to
11% and 23%, respectively.
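
As an illustration of how such percentages can be derived from a pairwise alignment, the following minimal Python sketch counts identical and conservatively substituted positions. It rests on several assumptions: the denominator is taken as the number of columns without gaps, and "positively conserved" is approximated by a small hand-written set of residue groups (`SIMILAR_GROUPS`, hypothetical) rather than by a substitution matrix. The toy sequences are not taken from the helicase alignment.

```python
# Hypothetical conservative-substitution groups; alignment programs normally
# derive "positive" matches from a substitution matrix instead.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def identity_and_similarity(seq_a, seq_b):
    """Return (% identical, % similar-but-not-identical) over ungapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0, 0.0
    identical = sum(a == b for a, b in columns)
    similar = sum(
        a != b and any(a in group and b in group for group in SIMILAR_GROUPS)
        for a, b in columns
    )
    total = len(columns)
    return 100.0 * identical / total, 100.0 * similar / total


if __name__ == "__main__":
    # Toy aligned fragments, for illustration only.
    print(identity_and_similarity("MKV-LT", "MRVALS"))
```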

As far as the MBD structure is concerned, it should be
noted that all coronaviruses exhibit very different amino
acid sequences. However, coronavirus MBDs, like those of
all the other nidoviruses, have a Cys and His rich motif
that can guide the overall modeling procedure of this
domain. In fact, arterivirus helicases have been shown to
possess MBD domains which coordinate four Zn2+ ions
[28,29] in a way that could be similarly adopted by
coronaviruses. In this Zn2+ binding domain, also known as a
binuclear cluster, a Cys residue coordinates two closely
spaced Zn2+ ions, bridging a Zn2Cys6 group, typically
found in GAL4-like proteins [30], with a Zn2Cys4His2 group, found
in the RAG1 domain [31]. The SCV MBD structure has been
reconstructed using the tetrahedral geometry,
distance constraints, and orientations typical of the Cys and
His residues of the latter clusters, as suggested by the crystal
structures of GAL4-like proteins and RAG1. The corresponding
Protein Data Bank files, 1PYI and 1RMD, respectively
[30,31], provided accurate distance constraints for the
two components of the MBD.
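
The tetrahedral coordination geometry mentioned above can be checked numerically. The minimal Python sketch below measures how far the ligand-metal-ligand angles around one zinc ion deviate from the ideal tetrahedral angle of about 109.47°. The function names and the test coordinates are hypothetical, and the sketch is not the procedure applied by DYANA or GROMACS.

```python
import math


def angle_deg(center, p, q):
    """Angle p-center-q in degrees."""
    v1 = [a - c for a, c in zip(p, center)]
    v2 = [a - c for a, c in zip(q, center)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(a * a for a in v2))
    cosine = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.degrees(math.acos(cosine))


def tetrahedral_deviation(zinc, ligands):
    """Largest absolute deviation (degrees) from the ideal tetrahedral angle."""
    ideal = math.degrees(math.acos(-1.0 / 3.0))  # about 109.47 degrees
    angles = [
        angle_deg(zinc, ligands[i], ligands[j])
        for i in range(len(ligands))
        for j in range(i + 1, len(ligands))
    ]
    return max(abs(a - ideal) for a in angles)


if __name__ == "__main__":
    # Idealized ligand positions at alternating corners of a cube (not real data).
    ligands = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
    print(tetrahedral_deviation((0, 0, 0), ligands))
```

For a modeled site, the four ligand positions would be the coordinating Cys sulfur or His nitrogen atoms, and a large deviation would flag a distorted coordination sphere.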

The reliability of the predicted tertiary structures of
both SCV helicase domains was tested on the basis of
simple physico-chemical rules, such as the ones suggested by
Ramachandran plots, and the agreement between the
hydropathy profile and residue accessibility in the two modeled
structures. The predicted structure of the Hel domain is
further supported by the fact that 10
out of a total of 15 cysteine residues, which are predicted to
form cystine bonds, are involved in disulphide bridges.
After refinement of loop regions and manual tracing of
side-chains for both the Hel and MBD models, no severe
disallowed atomic contact was detected with PROCHECK
v.3.5.4 [19], suggesting an essentially good stereochemistry,
with 63% and 29% of the residues in the most favored and
additional allowed regions, respectively, and with 4.8% and
3.2% of the residues in the generously allowed and disallowed regions
of the Ramachandran plot.

Once the models of the Hel and MBD domains had been
obtained, a molecular docking simulation was
carried out to obtain the complete nsp13 tertiary structure
in an unbiased way. Among the lowest energy solutions,
only the ones consistent with the overall protein sequence
were taken into account and manually refined before a final
energy minimization. The theoretical structure of SCV
nsp13, shown in Fig. 1, was again confirmed by
PROCHECK v.3.5.4 [19] and deposited in the Protein
Data Bank with the ID code 2G1F.

Fig. 1. Ribbon representation of the SCV nsp13 tertiary structure. The
MBD domain is shown in blue, and the P-loop and zinc atoms are
highlighted in red and violet. (For interpretation of the references to color in this
figure legend, the reader is referred to the web version of this paper.)

Fig. 2. (A) Difference between the energy profiles obtained with the PROSAII program package for the entire nsp13 protein and for the MBD. (B) The
same difference calculated for the Hel domain.

The reliability of protein structures may be conveniently
assessed with the PROSAII program package [20] and,
accordingly, the predicted single components of the SCV helicase
as well as the final tertiary structure have been analyzed
with this program package. In Fig. 2, we compare the
energy profiles obtained with PROSAII [20]
for the isolated Hel and MBD domains with the one
calculated for the entire SCV helicase structure. From the
negative values of the energy obtained by
subtracting the energy profiles of the two separate domains
from the one calculated for the final protein structure (see
Fig. 2), it is apparent that higher stability is achieved when
the two domains are assembled. Indeed, negative peaks in the
latter profiles are found for the interface regions of
the two domains, i.e., residues 42-48 for the MBD and residues
198-204, 325-335, and 353-360 for the Hel domain,
respectively.
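
The profile subtraction described above can be sketched in a few lines of Python. The example assumes that both profiles are per-residue energy lists covering the same residues in the same order, and it uses invented toy values, not PROSAII output. The function names are hypothetical.

```python
def profile_difference(assembled, isolated):
    """Per-residue difference: assembled-protein energy minus isolated-domain energy."""
    if len(assembled) != len(isolated):
        raise ValueError("profiles must cover the same residues")
    return [a - i for a, i in zip(assembled, isolated)]


def negative_stretches(difference, threshold=0.0):
    """1-based inclusive residue ranges where the difference is below threshold."""
    stretches = []
    start = None
    for position, value in enumerate(difference, start=1):
        if value < threshold:
            if start is None:
                start = position
        elif start is not None:
            stretches.append((start, position - 1))
            start = None
    if start is not None:
        stretches.append((start, len(difference)))
    return stretches


if __name__ == "__main__":
    # Toy energy values, for illustration only.
    assembled = [-1.0, -2.0, -2.5, -1.0, -0.5, -3.0]
    isolated = [-1.0, -1.0, -1.5, -1.0, -0.5, -2.0]
    print(negative_stretches(profile_difference(assembled, isolated)))
```

Residue ranges returned by such a scan correspond to regions that are stabilized in the assembled structure, which is how the interface regions are read off the difference profiles.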

Since a typical ATP binding site motif is predicted between residues
282 and 289 of the helicase sequence, it is
noteworthy that in the nsp13 model obtained here the
corresponding fragment, the so-called P-loop, is surface exposed
and located between a β strand and a short α helix, exactly
as expected for this active site [32] (see Fig. 1).
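
A motif scan of this kind can be approximated with a regular expression. The minimal Python sketch below uses the widely cited Walker A (P-loop) consensus [AG]-x(4)-G-K-[ST], which spans eight residues. Whether MotifScan applied exactly this pattern is an assumption, and the sequence in the example is a toy fragment rather than the helicase sequence.

```python
import re

# Walker A / P-loop consensus [AG]-x(4)-G-K-[ST]; the lookahead allows
# overlapping candidate sites to be reported.
WALKER_A = re.compile(r"(?=([AG].{4}GK[ST]))")


def find_p_loops(sequence):
    """Return (start, end, fragment) tuples with 1-based inclusive positions."""
    return [
        (match.start() + 1, match.start() + 8, match.group(1))
        for match in WALKER_A.finditer(sequence)
    ]


if __name__ == "__main__":
    # Toy sequence, for illustration only.
    print(find_p_loops("MTTLAPSGSGKSTLV"))
```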

In conclusion, we propose a theoretical model of the SCV helicase
that will provide a foundation for the
rational design of helicase inhibitors that could potentially
act as anti-SCV drugs.